Different scaling properties for the complexity of bidirectional synchronization and unidirectional
learning are essential for the security of neural cryptography. Incrementing the synaptic depth of 
the networks increases the synchronization time only polynomially, but the success of the geometric 
attack is reduced exponentially and it clearly fails in the limit of infinite synaptic depth. This 
method is improved by adding a genetic algorithm, which selects the fittest neural networks. The 
probability of a successful genetic attack is calculated for different model parameters using numerical 
simulations. The results show that scaling laws observed in the case of other attacks hold for the 
improved algorithm, too. The number of networks needed for an effective attack grows exponentially 
with increasing synaptic depth. In addition, finite-size effects caused by Hebbian and anti-Hebbian 
learning are analyzed. These learning rules converge to the random walk rule if the synaptic depth 
is small compared to the square root of the system size. 

PACS numbers: 84.35.+i, 87.18.Sn, 89.70.+c
I. INTRODUCTION

Neural cryptography is based on the effect that two neural networks are able to
synchronize by mutual learning. In each step of this online learning procedure they
receive a common input pattern and calculate their output. Then, both neural networks
use the outputs presented by their partner to adjust their own weights. So, they act
as teacher and student simultaneously. Finally, this process leads to fully
synchronized weight vectors.

Synchronization of neural networks is, in fact, a complex dynamical process. The
weights of the networks perform random walks, which are driven by a competition of
attractive and repulsive stochastic forces. Two neural networks can increase the
attractive effect of their moves by cooperating with each other. But a third network
which is only trained by the other two clearly has a disadvantage, because it cannot
skip some repulsive steps. Therefore, bidirectional synchronization is much faster
than unidirectional learning.

This effect can be applied to solve a cryptographic problem: Two partners A and B
want to exchange a secret message. A encrypts the message to protect the content
against an opponent E, who is listening to the communication. But B needs A's key in
order to decrypt the message. Therefore, the partners have to use a cryptographic
key-exchange protocol in order to generate a common secret key. This can be achieved
by synchronizing two neural networks, one for A and one for B, respectively. The
attacker E trains a third neural network using inputs and outputs transmitted by the
partners as examples. But, on average, learning is slower than synchronization. Thus,
there is only a small probability $P_E$ that E is successful before A and B
synchronize.




While other cryptographic algorithms use complicated calculations based on number
theory, the neural key-exchange protocol only needs basic mathematical operations,
namely adding and subtracting integer numbers. These can be realized efficiently in
integrated circuits. Computer scientists are already working on a hardware
implementation of neural cryptography.

Since the first proposal [1] of the neural key-exchange protocol, improved strategies
for the attackers and the partners have been suggested and analyzed. For the geometric
attack it has been found that the synaptic depth L determines the security of the
system: the success probability $P_E$ decreases exponentially with L, while the
synchronization time $t_{\text{sync}}$ increases only proportionally to $L^2$.
Therefore, any desired level of security against this attack can be reached by
increasing L.

An improved version of this method is the majority attack [12]. Here a group of M
neural networks estimates the output of B's hidden units. But, instead of updating the
weights individually, E's tree parity machines cooperate and adjust the weight vectors
in the same way according to the majority vote. While using this method increases
$P_E$, the scaling laws hold except for one special learning rule and random inputs
[12, 14]. Therefore, neural cryptography is secure against this attack in the limit
$L \to \infty$, too.

In this paper we analyze a different method for the opponent E. The genetic attack is
not based on optimal learning like the majority attack [12], but employs a genetic
algorithm in order to select the most successful of E's neural networks. First, we
repeat the definition of the neural key-exchange protocol in Sec. II. We also explain
why A and B have a clear advantage over E. The algorithm of the genetic attack is
presented in Sec. III. Here, we show that the scaling behavior observed for the
geometric attack and the majority attack also holds for the genetic attack. In Sec. IV
we analyze the influence of the learning rules on synchronization and learning.
Finally, the known attacks on the neural key-exchange protocol are compared regarding
their efficiency. The results presented in Sec. V show that the genetic attack is less
efficient than the majority attack except for some special cases.

FIG. 1: Tree parity machine with K = 3 and N = 4.



II. NEURAL CRYPTOGRAPHY 

In this section we repeat the definition of the neural key-exchange protocol. Each
partner, A and B, uses a tree parity machine. The structure of this neural network is
shown in Fig. 1. A tree parity machine consists of K hidden units, which work like
perceptrons. The possible input values are binary,

$$x_{ij} \in \{-1, +1\}, \tag{1}$$

and the weights are discrete numbers between $-L$ and $+L$,

$$w_{ij} \in \{-L, -L+1, \dots, L-1, L\}. \tag{2}$$



Here the index $i = 1, \dots, K$ denotes the $i$th hidden unit of the tree parity
machine and $j = 1, \dots, N$ the elements in each vector. The output of the first
layer is defined as the sign of the scalar product of inputs and weights,

$$\sigma_i = \operatorname{sgn}(\mathbf{w}_i \cdot \mathbf{x}_i). \tag{3}$$

The total output of the tree parity machine is given by the product (parity) of the
hidden units,

$$\tau = \prod_{i=1}^{K} \sigma_i. \tag{4}$$



At the beginning of the synchronization process A and B initialize the weights of
their neural networks randomly. This initial state is kept secret. In each time step
t, K random input vectors $\mathbf{x}_i$ are generated publicly and the partners
calculate the outputs $\tau^A$ and $\tau^B$ of their tree parity machines. After
communicating the output bits to each other they update the weights according to one
of the following learning rules:



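(i) Hebbian learning

$$\mathbf{w}_i^{+} = \mathbf{w}_i + \sigma_i \mathbf{x}_i \, \Theta(\sigma_i \tau) \, \Theta(\tau^A \tau^B), \tag{5}$$

(ii) Anti-Hebbian learning

$$\mathbf{w}_i^{+} = \mathbf{w}_i - \sigma_i \mathbf{x}_i \, \Theta(\sigma_i \tau) \, \Theta(\tau^A \tau^B), \tag{6}$$

(iii) Random walk

$$\mathbf{w}_i^{+} = \mathbf{w}_i + \mathbf{x}_i \, \Theta(\sigma_i \tau) \, \Theta(\tau^A \tau^B). \tag{7}$$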



If any component of the weight vectors moves out of the range $-L, \dots, +L$, it is
replaced by the nearest boundary value, either $-L$ or $+L$.

After some time $t_{\text{sync}}$ the partners have synchronized their tree parity
machines, $\mathbf{w}_i^A(t_{\text{sync}}) = \mathbf{w}_i^B(t_{\text{sync}})$, and the
process is stopped. Afterwards, A and B can use the weight vectors as a common secret
key in order to encrypt and decrypt secret messages.
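The protocol described above can be condensed into a short simulation. The following Python sketch is only an illustration and not the code used for the results of this paper: the function names (`tpm_output`, `update`, `synchronize`), the small default system size N = 10, and the convention that a vanishing local field yields $\sigma_i = -1$ are assumptions made here.

```python
import numpy as np

def tpm_output(w, x):
    """Hidden outputs sigma_i, Eq. (3), and total output tau, Eq. (4).

    w and x are integer arrays of shape (K, N).
    """
    h = np.sum(w * x, axis=1)
    sigma = np.where(h > 0, 1, -1)  # assumption: sgn(0) is mapped to -1
    return sigma, int(np.prod(sigma))

def update(w, x, sigma, tau, L, rule="random_walk"):
    """Apply Eq. (5), (6) or (7) in place; call only if tau^A == tau^B."""
    for i in np.flatnonzero(sigma == tau):  # Theta(sigma_i * tau) = 1
        if rule == "hebbian":
            w[i] += sigma[i] * x[i]
        elif rule == "anti_hebbian":
            w[i] -= sigma[i] * x[i]
        else:
            w[i] += x[i]
    np.clip(w, -L, L, out=w)  # reset weights that left the range -L..L

def synchronize(K=3, N=10, L=3, rule="random_walk", seed=0, max_steps=100_000):
    """Run the key exchange; return (number of steps, common weights)."""
    rng = np.random.default_rng(seed)
    w_a = rng.integers(-L, L + 1, size=(K, N))
    w_b = rng.integers(-L, L + 1, size=(K, N))
    for t in range(1, max_steps + 1):
        x = rng.choice([-1, 1], size=(K, N))  # public random inputs
        sig_a, tau_a = tpm_output(w_a, x)
        sig_b, tau_b = tpm_output(w_b, x)
        if tau_a == tau_b:  # Theta(tau^A * tau^B) = 1
            update(w_a, x, sig_a, tau_a, L, rule)
            update(w_b, x, sig_b, tau_b, L, rule)
        if np.array_equal(w_a, w_b):
            return t, w_a
    return None, None

steps, key = synchronize()
print("synchronized after", steps, "steps")
```

In a real exchange the partners cannot compare their weights directly; the equality test at the end of the loop is a shortcut that is only available in a simulation.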

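We describe the process of synchronization by standard order parameters, which are
also used for the analysis of online learning [19]. These order parameters are

$$Q_i^m = \frac{1}{N} \mathbf{w}_i^m \cdot \mathbf{w}_i^m, \tag{8}$$

$$R_i^{m,n} = \frac{1}{N} \mathbf{w}_i^m \cdot \mathbf{w}_i^n, \tag{9}$$

where the indices $m, n \in \{A, B, E\}$ denote A's, B's, or E's tree parity machine,
respectively. The level of synchronization between two corresponding hidden units is
defined by the (normalized) overlap,

$$\rho_i^{m,n} = \frac{\mathbf{w}_i^m \cdot \mathbf{w}_i^n}{\sqrt{\mathbf{w}_i^m \cdot \mathbf{w}_i^m} \, \sqrt{\mathbf{w}_i^n \cdot \mathbf{w}_i^n}} = \frac{R_i^{m,n}}{\sqrt{Q_i^m Q_i^n}}. \tag{10}$$

Uncorrelated weight vectors have $\rho = 0$, while the maximum value $\rho = 1$ is
reached for full synchronization.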
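As a small illustration of these definitions, the order parameters of one pair of hidden units can be computed directly from the weight vectors. The helper name `order_parameters` is hypothetical and chosen only for this sketch.

```python
import numpy as np

def order_parameters(w_m, w_n):
    """Return Q^m, Q^n, R^{m,n} and the overlap rho for one hidden unit."""
    w_m = np.asarray(w_m, dtype=float)
    w_n = np.asarray(w_n, dtype=float)
    n = w_m.size
    q_m = w_m @ w_m / n               # Eq. (8) for network m
    q_n = w_n @ w_n / n               # Eq. (8) for network n
    r = w_m @ w_n / n                 # Eq. (9)
    rho = r / np.sqrt(q_m * q_n)      # Eq. (10); undefined for a zero vector
    return q_m, q_n, r, rho

w = [1, -2, 3, 0]
print(order_parameters(w, w)[3])             # identical vectors
print(order_parameters([1, 0], [0, 1])[3])   # orthogonal vectors
```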

The overlap between two corresponding hidden units increases if the weights of both
neural networks are updated in the same way. Coordinated moves, which occur for
identical $\sigma_i$, have an attractive effect.

Changing the weights in only one hidden unit decreases the overlap on average. These
repulsive steps can only occur if the two output values $\sigma_i$ are different. The
probability for this event is given by the well-known generalization error of the
perceptron,

$$\epsilon_i = \frac{1}{\pi} \arccos \rho_i, \tag{11}$$



which itself is a function of the overlap $\rho_i$ between the hidden units. For an
attacker who simply trains a third tree parity machine using the examples generated by
A and B, repulsive steps occur with probability $P_r^E = \epsilon_i$, because E cannot
influence the process of synchronization.

FIG. 2: Probability $P_r$ of repulsive steps as a function of the generalization error
$\epsilon$. The inset shows the probability $P_g$ for a successful geometric
correction.

FIG. 3: Synchronization time $t_{\text{sync}}$ as a function of H for K = 3, N = 1000,
random walk learning rule, and different values of L, averaged over 10 000
simulations.

In contrast, A and B communicate with each other and are able to interact. If they
disagree on the total output, there is at least one hidden unit with
$\sigma_i^A \neq \sigma_i^B$. As an update would have a repulsive effect, the partners
just do not change the weights. In doing so, A and B reduce the probability of
repulsive steps in their hidden units. For K = 3 and identical generalization error,
$\epsilon_i = \epsilon$, we find

$$P_r^B = \frac{2(1-\epsilon)\epsilon^2}{(1-\epsilon)^3 + 3(1-\epsilon)\epsilon^2} \leq \epsilon = P_r^E. \tag{12}$$

Therefore, the partners have a clear advantage over an attacker using only simple
learning.
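The inequality in Eq. (12) can be checked numerically for a few overlaps. This is a minimal sketch for K = 3; the function names are hypothetical, and the closed form used is $P_r^B = 2(1-\epsilon)\epsilon^2 / [(1-\epsilon)^3 + 3(1-\epsilon)\epsilon^2]$.

```python
import math

def generalization_error(rho):
    """Eq. (11): probability of different hidden outputs at overlap rho."""
    return math.acos(rho) / math.pi

def repulsive_prob_partner(eps):
    """Eq. (12): repulsive steps for A and B with K = 3 (requires eps < 1)."""
    agree = (1 - eps) ** 3 + 3 * (1 - eps) * eps ** 2  # P(tau^A == tau^B)
    return 2 * (1 - eps) * eps ** 2 / agree

for rho in (0.0, 0.5, 0.9):
    eps = generalization_error(rho)
    print(f"rho={rho:.1f}  P_r^E={eps:.4f}  P_r^B={repulsive_prob_partner(eps):.4f}")
```

The two probabilities coincide only for $\epsilon = 1/2$, because $\epsilon - P_r^B$ is proportional to $(2\epsilon - 1)^2$.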

But E can use a more advanced method called the geometric attack. As before, she
trains a third tree parity machine, which has the same structure as A's and B's. In
each step $\tau^E$ is calculated and compared to $\tau^B$. As long as these output
values are identical, E can apply the learning rule in the same manner as B. But if
$\tau^E \neq \tau^B$, the attacker has to correct this deviation before updating the
weights.

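For this purpose E uses the local field

$$h_i = \frac{1}{\sqrt{N}} \mathbf{w}_i \cdot \mathbf{x}_i \tag{13}$$

of her hidden units as additional information. Then, the probability of
$\sigma_i^E \neq \sigma_i^B$ is given by the prediction error of the perceptron [20],

$$\epsilon_i(\rho_i, h_i) = \frac{1}{2} \left[ 1 - \operatorname{erf}\left( \frac{\rho_i}{\sqrt{2(1-\rho_i^2)}} \, \frac{|h_i|}{\sqrt{Q_i}} \right) \right]. \tag{14}$$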



If the local field $h_i$ is zero, the neural network has no information about the
input vector $\mathbf{x}_i$, because it is perpendicular to the weight vector
$\mathbf{w}_i$. In this case the prediction error reaches its global maximum of
$\epsilon_i(\rho_i, 0) = 1/2$.
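The behavior of the prediction error can be illustrated with a few lines of Python. The sketch assumes the standard perceptron form of Eq. (14), $\epsilon_i = \frac{1}{2}[1 - \operatorname{erf}(\rho_i |h_i| / \sqrt{2(1-\rho_i^2) Q_i})]$; the function name `prediction_error` is hypothetical.

```python
import math

def prediction_error(rho, h, q):
    """Prediction error for overlap rho, local field h and order parameter Q = q.

    Assumes |rho| < 1 and q > 0.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    arg = rho / math.sqrt(2.0 * (1.0 - rho ** 2)) * abs(h) / math.sqrt(q)
    return 0.5 * (1.0 - math.erf(arg))

for h in (0.0, 0.5, 1.0, 2.0):
    print(h, prediction_error(rho=0.5, h=h, q=1.0))
```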

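The prediction error $\epsilon_i(\rho_i, h_i)$ is a strictly monotonically decreasing
function of $|h_i|$. Therefore, the attacker searches for the hidden unit with the
lowest value of the absolute local field $|h_i^E|$ and flips the sign of
$\sigma_i^E$. This results in $\tau^E = \tau^B$ and the learning rule can be applied.
But the geometric attack does not always find the correct hidden unit which caused the
deviation of the total output bits. If $\sigma_i^E \neq \sigma_i^B$ in the $i$th hidden
unit and $\sigma_j^E = \sigma_j^B$ in all other hidden units, E flips the sign of
$\sigma_i^E$ with probability

$$P_g = \int_0^\infty \prod_{j \neq i} \left( \int_{h_i}^\infty \frac{2}{\sqrt{2\pi Q_j}} \, \frac{1 - \epsilon_j(\rho_j, h_j)}{1 - \epsilon_j} \, e^{-h_j^2/(2Q_j)} \, dh_j \right) \frac{2}{\sqrt{2\pi Q_i}} \, \frac{\epsilon_i(\rho_i, h_i)}{\epsilon_i} \, e^{-h_i^2/(2Q_i)} \, dh_i. \tag{15}$$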



Thus, the geometric attacker avoids some repulsive steps, 
although they still occur more frequently than in the 
partners' tree parity machines. 
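The correction step of the geometric attack can be sketched as follows. The function name `geometric_correction` and the convention sgn(0) = -1 are assumptions of this example; the returned hidden outputs would afterwards be used in the learning rule in place of E's original ones.

```python
import numpy as np

def geometric_correction(w_e, x, tau_b):
    """Return E's hidden outputs after the geometric correction.

    w_e, x: arrays of shape (K, N); tau_b: total output received from B.
    """
    h = np.sum(w_e * x, axis=1) / np.sqrt(x.shape[1])  # local fields, Eq. (13)
    sigma = np.where(h > 0, 1, -1)                      # assumption: sgn(0) -> -1
    if int(np.prod(sigma)) != tau_b:
        i = int(np.argmin(np.abs(h)))  # hidden unit with the smallest |h_i|
        sigma[i] = -sigma[i]           # flip the least reliable output
    return sigma

w_e = np.array([[2, 2], [1, 0], [-3, -1]])
x = np.ones((3, 2), dtype=int)
print(geometric_correction(w_e, x, tau_b=1))
```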

In the case of identical generalization error $\epsilon_i = \epsilon$ and K = 3, we
find that the probability of repulsive steps,

$$P_r^E = 2(1 - P_g)(1-\epsilon)^2 \epsilon + 2(1-\epsilon)\epsilon^2 + \frac{2}{3}\epsilon^3, \tag{16}$$

is higher than $P_r^B$, but lower than $P_r^E = \epsilon$ for simple learning. This
result is clearly visible in Fig. 2. That is why learning by listening is slower than
mutual learning, even for advanced algorithms. This effect makes neural cryptography
feasible and prevents successful attacks in the limit $L \to \infty$.
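The three contributions to Eq. (16) can be traced back to the number of hidden units in which E's outputs differ from those of the reference network. The following breakdown is a sketch under the assumption that the K = 3 hidden units disagree independently with probability $\epsilon$; in every case listed, two units disagree after the correction, so a given unit is affected with probability 2/3.

```latex
\begin{align*}
\text{one disagreement, wrong unit flipped:}\quad
  & 3\epsilon(1-\epsilon)^2 \,(1-P_g)\cdot\tfrac{2}{3}
    = 2(1-P_g)(1-\epsilon)^2\epsilon, \\
\text{two disagreements, no correction:}\quad
  & 3\epsilon^2(1-\epsilon)\cdot\tfrac{2}{3}
    = 2(1-\epsilon)\epsilon^2, \\
\text{three disagreements, one unit flipped:}\quad
  & \epsilon^3\cdot\tfrac{2}{3}
    = \tfrac{2}{3}\epsilon^3 .
\end{align*}
```

As a consistency check, $\epsilon = 1/2$ together with a purely random guess, $P_g = 1/3$, gives $P_r^E = 1/6 + 1/4 + 1/12 = 1/2 = \epsilon$.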

Recently, it has been discovered that the security of the neural key-exchange protocol
can be improved by using queries instead of random inputs. The partners ask each other
questions which depend on their own weight vectors $\mathbf{w}_i$ and an additional
public parameter H. In odd (even) steps A (B) generates K input vectors
$\mathbf{x}_i$ with $h_i^A \approx \pm H$ ($h_i^B \approx \pm H$). So, the absolute
value of the local field $h_i$ is given by H, while its sign $\sigma_i$ is chosen
randomly.

Queries change the relation between the overlap and the frequency $P_r$ of repulsive
steps. The probability of different outputs $\sigma_i$ in corresponding hidden units
is now given by Eq. (14) instead of Eq. (11), because the absolute local field in A's
or B's hidden units is known. Consequently, the partners can optimize complexity and
security of the neural key-exchange protocol by adjusting H and L suitably.

FIG. 4: Success probability $P_E$ of the genetic attack for K = 3, N = 1000, random
walk learning rule, and M = 4096. Symbols represent results obtained from 1000
simulations, and the lines show a fit with Eq. (17).

FIG. 5: Parameters $\mu$ and $\beta$ as a function of the synaptic depth L. Symbols
denote results of fitting simulation data for different M with Eq. (17) and the lines
were calculated using the model given in Eq. (19).

As shown in Fig. 3, a minimum value of H is needed in order to achieve synchronization
in a reasonable number of steps. If $H > \alpha_c L$, $t_{\text{sync}}$ increases
proportionally to $L^2 \ln N$, but for $H < \alpha_c L$ it diverges [14, 18]. In the
case of the random walk learning rule we estimate $\alpha_c \approx 0.31$ by using the
extrapolation method described in [14].
III. GENETIC ATTACK

For the genetic attack the opponent starts with only one tree parity machine, but she
can use up to M neural networks. As before, E calculates the output of her networks in
each step. Afterwards the following genetic algorithm is applied:

(i) If $\tau^A = \tau^B$ and E has at most $M/2^{K-1}$ tree parity machines, she
determines all $2^{K-1}$ internal representations $(\sigma_1^E, \dots, \sigma_K^E)$
which reproduce the output $\tau^A$. Then, these are used to update the weights in E's
neural networks according to the learning rule, so that $2^{K-1}$ variants of each
tree parity machine are generated.

(ii) But if E already has more than $M/2^{K-1}$ neural networks, the mutation step
described above is not possible. Instead, the attacker discards all tree parity
machines which predicted fewer than U outputs $\tau^A$ successfully in the last V
learning steps with $\tau^A = \tau^B$. In our simulations we use a limit of U = 10 and
a history of V = 20 as default values. Additionally, at least 20 neural networks are
kept in such a selection step.



(iii) In the case of $\tau^A \neq \tau^B$ the attacker's networks remain unchanged,
because A and B do not update the weights in their tree parity machines.

FIG. 6: Offset $\delta$ as a function of the number of attackers M, for K = 3,
N = 1000, and the random walk learning rule. Symbols and the line were obtained by a
fit with Eq. (19).

The attack is considered successful if at least one of E's neural networks has
synchronized 98% of the weights before the end of the key exchange. We use this
relaxed criterion in order to decrease the fluctuations of $P_E$.

The success probability of the genetic attack strongly depends on the value of the
parameter H. This effect is clearly visible in Fig. 4. In order to determine $P_E$ as
a function of H, a Fermi-Dirac distribution

$$P_E = \frac{1}{1 + \exp[-\beta(H - \mu)]} \tag{17}$$

with two parameters $\beta$ and $\mu$ can be used as a fitting model. This equation is
also valid for the geometric attack and the majority attack.

Figure 5 shows the results of the fits using Eq. (17). While $\beta$ is nearly
independent of L and M, $\mu$ increases linearly with the synaptic depth,

$$\mu = \alpha_s L + \delta. \tag{18}$$
FIG. 7: Success probability of the genetic attack as a function of the synchronization
time for K = 3, N = 1000, random walk learning rule, M = 4096, and different values of
L. The dashed line shows the envelope of this set of curves.



FIG. 8: Success probability of the genetic attack for K = 3, 
L = 7, N = 1000, random walk learning rule, M = 4096, and 
H = 2.28. These results were obtained by averaging over 100 
simulations. 



Obviously, the attacker can change the offset $\delta$, but not $\alpha_s$, by using
more resources. As shown in Fig. 6, E needs to double M in order to decrease $\delta$
by a fixed amount $\gamma \ln 2$. Thus, $\mu$ is a linear function of both L and
$\ln M$,

$$\mu = \alpha_s L - \gamma \ln M + \mu_0. \tag{19}$$

Substituting Eq. (19) into Eq. (17) leads to

$$P_E = \frac{1}{1 + \exp[\beta(\mu_0 - \gamma \ln M)] \, \exp[\beta(\alpha_s - \alpha) L]} \tag{20}$$

for the success probability of the genetic attack as a function of $\alpha = H/L$, the
synaptic depth L, and the maximal number of attackers M.
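The dependence of the success probability on L and M can be explored with a direct evaluation of the fitting model. In this sketch the function name `success_probability` is hypothetical, and the numerical values of $\beta$, $\alpha_s$, $\gamma$, and $\mu_0$ are placeholders chosen only for illustration; they are not the fitted values of this paper.

```python
import math

def success_probability(alpha, L, M, beta, alpha_s, gamma, mu0):
    """Fermi-Dirac model, Eq. (17), with mu(L, M) from Eq. (19) and H = alpha * L."""
    mu = alpha_s * L - gamma * math.log(M) + mu0
    return 1.0 / (1.0 + math.exp(-beta * (alpha * L - mu)))

# Placeholder parameters for illustration only (NOT fitted values).
params = dict(beta=10.0, alpha_s=0.4, gamma=0.1, mu0=1.0)

for L in (3, 5, 7, 9):
    p = success_probability(alpha=0.32, L=L, M=4096, **params)
    print(f"L={L}  P_E={p:.3e}")
```

For $\alpha < \alpha_s$ the printed values decrease with L, while a larger M shifts the whole curve upwards.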

From these results we can deduce the scaling of $P_E$ with regard to L and M. For
large values of the synaptic depth the asymptotic behavior is given by

$$P_E \sim e^{-\beta(\mu_0 - \gamma \ln M)} \, e^{-\beta(\alpha_s - \alpha)L} \tag{21}$$

as long as $\alpha < \alpha_s$.

This equation shows that the partners have a great advantage over an attacker. If A
and B increase L, the success probability drops exponentially,

$$P_E \propto e^{-\beta(\alpha_s - \alpha)L}, \tag{22}$$

while the complexity of the synchronization rises only polynomially. This is clearly
visible if one looks at the function $P_E(\langle t_{\text{sync}} \rangle)$, which is
shown in Fig. 7. Due to the offset $\delta$ in Eq. (18) the attacker is successful for
small values of L. But for larger synaptic depth optimal security is reached for
values of H and L which lie on the envelope of
$P_E(\langle t_{\text{sync}} \rangle)$. This curve is approximately given by
$H = \alpha_c L$, as this condition maximizes $\alpha_s - \alpha$ while
synchronization is still possible.

In contrast, the attacker has to increase the number of her tree parity machines
exponentially,

$$M \propto e^{[(\alpha_s - \alpha)/\gamma] L}, \tag{23}$$

in order to compensate for a change of L and maintain a constant success probability
$P_E$. But this is usually not possible due to limited computer power.
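The scaling relation (23) follows from the asymptotic form (21) by requiring that the success probability stays constant when L is changed. Written out as a short worked step:

```latex
\begin{align*}
\ln P_E &\simeq -\beta(\mu_0 - \gamma \ln M) - \beta(\alpha_s - \alpha)L
          = \text{const} \\
\Rightarrow\quad \gamma \ln M &= (\alpha_s - \alpha)L + \text{const}' \\
\Rightarrow\quad M &\propto e^{[(\alpha_s - \alpha)/\gamma]L} .
\end{align*}
```

Each increase of L by one therefore requires multiplying M by the fixed factor $e^{(\alpha_s - \alpha)/\gamma}$.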

Alternatively, the attacker could try to optimize the other two parameters of the
genetic attack. As shown in Fig. 8, E obtains the best result if she uses U = 30,
V = 50 instead of U = 10, V = 20. The corresponding figure shows that this
modification leads to a lower value of $\beta$, but does not influence $\mu(L)$.
Therefore, E gains little, as the scaling relation (23) is not affected. That is why A
and B can easily reach an arbitrary level of security.
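The mutation and selection steps (i) to (iii) described at the beginning of this section can be sketched in Python as follows. This is an illustration under several assumptions that the text does not specify: the random walk rule (7) is used, children inherit the prediction history of their parent, the surviving networks are left unchanged in a selection step, and sgn(0) is mapped to -1. All function names are hypothetical.

```python
import itertools
import numpy as np

def internal_representations(K, tau):
    """All 2^(K-1) sign vectors (sigma_1, ..., sigma_K) with product tau."""
    return [np.array(s) for s in itertools.product((-1, 1), repeat=K)
            if np.prod(s) == tau]

def output(w, x):
    """Total output tau of one tree parity machine."""
    h = np.sum(w * x, axis=1)
    return int(np.prod(np.where(h > 0, 1, -1)))  # assumption: sgn(0) -> -1

def genetic_step(population, x, tau_a, tau_b, M, L, U=10, V=20, min_keep=20):
    """One step of the genetic attack.

    population: list of (weights, history) pairs, where history is a list of
    1/0 flags for correct/incorrect predictions of tau_a in earlier steps.
    """
    if tau_a != tau_b:                        # step (iii): nothing changes
        return population
    K = x.shape[0]
    scored = [(w, (hist + [int(output(w, x) == tau_a)])[-V:])
              for w, hist in population]
    if len(scored) <= M / 2 ** (K - 1):       # step (i): mutation
        children = []
        for w, hist in scored:
            for sigma in internal_representations(K, tau_a):
                child = w.copy()
                mask = sigma == tau_a         # hidden units that are updated
                child[mask] += x[mask]        # random walk rule, Eq. (7)
                children.append((np.clip(child, -L, L), list(hist)))
        return children
    # step (ii): selection of the fittest networks
    ranked = sorted(scored, key=lambda p: sum(p[1]), reverse=True)
    fit = [p for p in ranked if sum(p[1]) >= U]
    return fit if len(fit) >= min_keep else ranked[:min_keep]

rng = np.random.default_rng(0)
x = rng.choice([-1, 1], size=(3, 8))
population = [(rng.integers(-3, 4, size=(3, 8)), [])]
population = genetic_step(population, x, tau_a=1, tau_b=1, M=64, L=3)
print(len(population), "networks after one mutation step")
```

With K = 3 every mutation step multiplies the population by four, so the limit M is reached after about $\log_4 M$ steps with $\tau^A = \tau^B$.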



IV. LEARNING RULES 

Besides the random walk learning rule (7) used so far, there are two other suitable
algorithms for updating the weights: the Hebbian learning rule (5) and the
anti-Hebbian learning rule (6). The only difference between these three rules is
whether and how the output $\sigma_i$ of the hidden unit is included in the update
step. But this causes some effects which we discuss in this section.

In the case of the Hebbian rule A's and B's tree parity machines learn their own
output. Therefore, the direction in which the weight $w_{ij}$ moves is determined by
the product $\sigma_i x_{ij}$. But, as the output of a hidden unit is a function of
all input values, there are correlations between $x_{ij}$ and $\sigma_i$. That is why
the probability distribution of $\sigma_i x_{ij}$ is not uniform in the case of random
inputs, but depends on the corresponding weight $w_{ij}$,

$$P(\sigma_i x_{ij} = 1) = \frac{1}{2} \left[ 1 + \operatorname{erf}\left( \frac{w_{ij}}{\sqrt{N Q_i - w_{ij}^2}} \right) \right]. \tag{24}$$

According to this equation, $\sigma_i x_{ij} = \operatorname{sgn}(w_{ij})$ occurs more
often than $\sigma_i x_{ij} = -\operatorname{sgn}(w_{ij})$. Thus, the Hebbian learning
rule (5) pushes the weights towards the boundaries at $-L$ and $+L$.
FIG. 9: Length of the weight vectors in the steady state for K = 3 and N = 1000.
Symbols denote results averaged over 1000 simulations and lines show the first-order
approximation given in Eq. (29) and Eq. (31).

FIG. 10: Parameters $\mu$ and $\beta$ as a function of L for the genetic attack with
K = 3, N = 1000, and M = 4096. The symbols represent results from 1000 simulations and
the lines show a fit using the model given in Eq. (19).



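In order to quantify this effect we calculate the stationary probability distribution
of the weights. Using Eq. (24) for the transition probabilities leads to

$$P(w_{ij} = w) = p_0 \prod_{m=1}^{|w|} \frac{1 + \operatorname{erf}\left( \frac{m-1}{\sqrt{N Q_i - (m-1)^2}} \right)}{1 - \operatorname{erf}\left( \frac{m}{\sqrt{N Q_i - m^2}} \right)}, \tag{25}$$

where the normalization constant $p_0$ is given by

$$p_0 = \left( \sum_{w=-L}^{L} \prod_{m=1}^{|w|} \frac{1 + \operatorname{erf}\left( \frac{m-1}{\sqrt{N Q_i - (m-1)^2}} \right)}{1 - \operatorname{erf}\left( \frac{m}{\sqrt{N Q_i - m^2}} \right)} \right)^{-1}. \tag{26}$$

In the limit $N \to \infty$ the argument of the error function vanishes and the
weights are uniformly distributed. In this case the synchronization process does not
change the initial length

$$\sqrt{Q_i(t=0)} = \sqrt{\frac{L(L+1)}{3}} \tag{27}$$

of the weight vector.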

But for finite N the probability distribution (25) itself depends on the order
parameter $Q_i$. Therefore, the expectation value of $Q_i$ is the solution of the
following equation:

$$Q_i = \sum_{w=-L}^{L} w^2 \, P(w_{ij} = w). \tag{28}$$

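By expanding Eq. (28) in terms of $N^{-1/2}$ we obtain

$$Q_i \approx \frac{L(L+1)}{3} + \frac{8L^4 + 16L^3 - 10L^2 - 18L + 9}{15\sqrt{3\pi L(L+1)}} \, \frac{1}{\sqrt{N}} \tag{29}$$

as a first-order approximation of $Q_i$ for large system sizes. In the case of
$1 \ll L \ll \sqrt{N}$ the asymptotic behavior of this order parameter is given by

$$Q_i \sim \frac{L(L+1)}{3} \left( 1 + \frac{8}{5\sqrt{3\pi}} \, \frac{L}{\sqrt{N}} \right). \tag{30}$$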



Obviously, the application of the Hebbian learning rule increases the length of the
weight vectors until a steady state is reached. Additionally, the changed probability
distribution of the weights affects the synchronization process and the success of
attacks. That is why one encounters finite-size effects if $L/\sqrt{N}$ is large.
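The self-consistency condition (28) can also be solved numerically and compared with the first-order approximation (29). The sketch below is an illustration only: it assumes that the argument of the error function in Eqs. (24) to (26) has the form $w/\sqrt{N Q_i - w^2}$, uses a simple fixed-point iteration, and all function names are hypothetical.

```python
import math

def stationary_distribution(L, N, Q):
    """Stationary weight distribution P(w), w = -L..L, Eqs. (25) and (26)."""
    def factor(m):  # m-th factor of the product in Eq. (25)
        num = 1.0 + math.erf((m - 1) / math.sqrt(N * Q - (m - 1) ** 2))
        den = 1.0 - math.erf(m / math.sqrt(N * Q - m ** 2))
        return num / den
    unnorm = [math.prod(factor(m) for m in range(1, w + 1)) for w in range(L + 1)]
    total = unnorm[0] + 2.0 * sum(unnorm[1:])   # distribution is symmetric in w
    return {w: unnorm[abs(w)] / total for w in range(-L, L + 1)}

def solve_Q(L, N, tol=1e-12, max_iter=1000):
    """Fixed-point iteration for Eq. (28), started at the uniform value."""
    Q = L * (L + 1) / 3.0
    for _ in range(max_iter):
        p = stationary_distribution(L, N, Q)
        Q_new = sum(w * w * p[w] for w in p)
        if abs(Q_new - Q) < tol:
            return Q_new
        Q = Q_new
    return Q

def first_order_Q(L, N):
    """First-order approximation, Eq. (29)."""
    poly = 8 * L**4 + 16 * L**3 - 10 * L**2 - 18 * L + 9
    return L * (L + 1) / 3.0 + poly / (15.0 * math.sqrt(3.0 * math.pi * L * (L + 1))) / math.sqrt(N)

L, N = 3, 1000
print("uniform    :", L * (L + 1) / 3.0)
print("numerical  :", solve_Q(L, N))
print("first order:", first_order_Q(L, N))
```

For the anti-Hebbian rule the same procedure applies with the sign of the error-function argument reversed, which lowers $Q_i$ below the uniform value.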

In the case of the anti-Hebbian rule A's and B's tree parity machines learn the
opposite of their own outputs. Therefore, the weights are pulled away from the
boundaries, so that

$$Q_i \approx \frac{L(L+1)}{3} - \frac{8L^4 + 16L^3 - 10L^2 - 18L + 9}{15\sqrt{3\pi L(L+1)}} \, \frac{1}{\sqrt{N}} \tag{31}$$

$$Q_i \sim \frac{L(L+1)}{3} \left( 1 - \frac{8}{5\sqrt{3\pi}} \, \frac{L}{\sqrt{N}} \right) \tag{32}$$

for $1 \ll L \ll \sqrt{N}$. Here, the length of the weight vectors $\mathbf{w}_i$ is
decreased.

In contrast, the random walk learning rule always uses a fixed set output. Here, the
weights stay uniformly distributed, because only the random input values $x_{ij}$
determine the direction of the movements. In this case the length of the weight
vectors is given by Eq. (27).

Figure 9 shows that the theoretical predictions are in good quantitative agreement
with simulation results as long as $L^2$ is small compared to the system size N. The
deviations for large L are caused by higher-order terms which are ignored in Eq. (29)
and Eq. (31).
The choice of the learning rule affects synchronization with random inputs as well as
with queries. As the prediction error (14) is a function of $h_i/\sqrt{Q_i}$, this
ratio instead of just the local field determines the behavior of the system. That is
why there are different values of $\alpha_c$ and $\alpha_s$ for each learning rule,
which is shown in Fig. 10 and Fig. 11.

In the limit $N \to \infty$, however, a system using Hebbian or anti-Hebbian learning
exhibits the same dynamics as observed in the case of the random walk rule for all
system sizes. This is clearly visible in Fig. 11. Consequently, one can determine the
properties of neural cryptography in the limit $N \to \infty$ without actually
analyzing very large systems. It is sufficient to use the random walk learning rule
and moderate values of N in simulations.




FIG. 12: Success probability $P_E$ of the geometric attack with M = 1, the majority
attack with M = 100, and the genetic attack with M = 4096, for K = 3, N = 1000, and
the random walk learning rule. Symbols represent results averaged over 1000
simulations, in part (a) for random inputs and in part (b) for queries with
H = 0.32L. The lines were obtained by fitting with Eq. (33).



V. SECURITY 

In order to assess the security of the neural key- 
exchange protocol one has to consider all known attacks. 
Therefore, we compare the efficiency of several methods 
here. 

Figure 12 shows that the success probability $P_E$ drops exponentially with increasing
synaptic depth L,

$$P_E \sim e^{-y(L - L_0)}, \tag{33}$$

as long as $L > L_0$. While this scaling behavior is the same for all attacks, the
constants $y$ and $L_0$ are different for each method.
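The constants $y$ and $L_0$ of Eq. (33) can be extracted from measured success probabilities by a linear fit of $\ln P_E$ against L. The following sketch uses synthetic data generated from Eq. (33) itself, not results from this paper, and the helper name `fit_exponential_decay` is hypothetical.

```python
import numpy as np

def fit_exponential_decay(L_values, p_values):
    """Least-squares fit of ln P_E = -y (L - L0); returns (y, L0)."""
    slope, intercept = np.polyfit(L_values, np.log(p_values), 1)
    y = -slope               # ln P_E = -y * L + y * L0
    return y, intercept / y

# Synthetic data for illustration only (NOT simulation results).
L_values = np.arange(4, 11)
p_values = np.exp(-0.8 * (L_values - 2.5))

y, L0 = fit_exponential_decay(L_values, p_values)
print(f"y = {y:.3f}, L0 = {L0:.3f}")
```

Only data points with $L > L_0$ should enter such a fit, because the exponential form does not hold for smaller synaptic depths.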

The geometric attack is the simplest method considered here. E only needs one tree
parity machine, but the success probability $P_E$ is lower than for the advanced
methods. As the exponent $y$ is large, the two partners can easily secure the neural
key-exchange protocol by increasing the synaptic depth [17].

In the case of the majority attack $P_E$ is higher, because the cooperation between
E's tree parity machines reduces the coefficient $y$. A and B have to compensate for
this by further stepping up L. In contrast, the genetic attack increases $L_0$, while
$y$ does not change significantly compared to the geometric attack. Therefore, the
genetic algorithm is better only if L is not too large. Otherwise E gains most by
using the majority attack.

As shown in Fig. 12, the partners can improve the security of the key-exchange
protocol against all three attacks by using queries. However, the majority attack
remains the most efficient of E's methods.

We note that these results are based on numerical extrapolations of the success
probability $P_E$. While analytical evidence for the complexity of a successful attack
would be desirable, it is not available yet in the case of the nondeterministic
methods with $P_E < 1$ discussed above. But there are only two successful
deterministic algorithms for E known at present: a brute-force attack or a genetic
attack with $M = 2^{(K-1) t_{\text{sync}}}$ networks. The complexity of these attacks
clearly grows exponentially with increasing L. Therefore, breaking the security of
neural cryptography belongs to the complexity class NP (nondeterministic polynomial
time), but we cannot prove that it is not in P (polynomial time). This situation is
similar to that of other cryptographic protocols, e.g., the Diffie-Hellman key
exchange.
VI. CONCLUSIONS

The security of cryptographic algorithms is usually 
based on different scaling laws regarding the computa- 
tional complexity for users and attackers. By changing 
some parameter one can increase the cost of a successful 
attack exponentially, while the effort for the users incre- 
ments only polynomially. For conventional cryptographic 
systems this parameter is the length of the key. In the 
case of neural cryptography it is the synaptic depth L of 
the neural networks. 

As the neural key-exchange protocol uses tree parity 
machines, an attacker faces the challenge to guess the in- 
ternal representation of these networks correctly. Learn- 
ing alone is not sufficient to solve this problem. Other- 
wise the scaling laws hold and the partners can achieve 
any desired level of security by increasing L. 

We have analyzed an attack which combines learning with a genetic algorithm. We have
found that this method is very successful as long as L is small. But attackers have to
increase the number of their neural networks exponentially in order to compensate for
higher values of L. That is why neural cryptography is secure against the genetic
attack as well.

This method achieves the best success probability of all known methods only if the
synaptic depth L is not too large. For higher values of L the attacker gains more by
using the majority attack. But both methods are unable to break the security of the
neural key-exchange protocol in the limit $L \to \infty$.

Additionally, we have studied the influence of different learning rules on the neural
key-exchange protocol. Hebbian and anti-Hebbian learning change the order parameter Q,
which is related to the length of the weight vectors. If the system size N is small
compared to $L^2$, this causes finite-size effects. But in the limit
$L/\sqrt{N} \to 0$ the behavior of all learning rules converges to that of the random
walk rule.

Based on our results, we conclude that the neural key-exchange protocol is secure
against all attacks known up to now. But, similar to other cryptographic algorithms,
there is always a possibility that a clever method may be found which destroys the
security of neural cryptography completely.